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ABSTRACT 

We present an efficient and robust approach for extracting clusters of galaxies from weak 
lensing survey data and measuring their properties. We use simple, physically-motivated clus- 
ter models appropriate for such sparse, noisy data, and incorporate our knowledge of the 
cluster mass function to optimise the detection of low-mass objects. Despite the method's 
non-linear nature, we are able to search at a rate of approximately half a square degree per 
hour on a single processor, making this technique a viable candidate for future wide-field sur- 
veys. We quantify, for two simulated data-sets, the accuracy of recovered cluster parameters, 
and discuss the completeness and purity of our shear-selected cluster catalogues. 
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1 INTRODUCTION 

Clusters of galaxies are the most massive gravitationally bound ob- 
jects in the universe and, as such, are critical tracers of the for- 
mation of large-scale structure. The number count of clusters as 
a function of their mass and redshift has been predicted bo th an- 
alytically (see e.g. lPress & Schechter|[l974l : IShefh et alj|200lh and 
from large scale n umerical simulations (see e.g. I Jenkins et al . 2001 ; 
lEvrard et alj|2002h . and are particular ly sensitive to the cos molog- 
ical parameters as and fi m (see e.g. iBattve &~W eller 2003|). The 
size and formation history of massive clusters is such that the ra- 
tio of gas mass to total mass is expected to be representative of the 
universal ratio fib/^m, once the relatively small amount of bary- 
onic matter in th e cluster galaxies is taken into account (see e.g. 
IWhiteetal.lll993h . 

The study of cosmic shear has rapidly progressed with the 
availability of high quality wide-field lensing data. Large dedi- 
cated surveys with ground-based telescopes have been employed 
to reconstruct the mass distribution of the universe and con- 
strain cosmological p arameters (see e.g. lMassev et alj[2007ll20o3 : 
Hoeks tra et alj l200q) . Weak lensing also allows one to detect 
galaxy clusters without making any assumptions about the baryon 
fraction, richness, morphology or dynamical state of the cluster, 
and so weak lensing cluster modelling would allow one to test 
these assumptions by observing the cluster with optical, X-ray or 
Sunyaev-ZeFdovich (SZ) methods. 

Despite the advances in data quality, weak lensing data re- 
mains very sparse and noisy. Hence, obtaining the shear signal from 
the shapes of the background galaxies is a very challenging task. 
Several grid-based methods have been devised to reconstruct the 
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projected mass distribution from the observed shear field (see e.g . 
I Kaiser & Squired! 19931 : ISauires & Kaiserlll996l : iBridle et alJI 19991 : 
IStarck et al.ll20061) . In these "non-parametric" methods, the model 
assumed is a grid of pix e ls wh ose values comprise the model pa- 
rameters. Mars hall et ail J2002h showed that such a large number 
of parameters is often not justified by the data quality - failure to 
recognise this can result in over-fitting and over-interpretation of 
the data. 

An alternative method for mass reconstruction is to work with 
simply-parameterised physically -motivated mo dels for the under- 
lying mass distribution ( Marsh all et al.1 [2003h . By fitting simple 
template cluster models to the observed data set, we can draw in- 
ferences about the cluster parameters directly. This involves cal- 
culating the probability distribution of these parameters, and also 
(perhaps) those of the background cosmology; we can also com- 
pare different cluster models, enhancing our astrophysical under- 
standing of these systems. This is most conveniently done through 
a Bayesian inference. 

In [Marshall et al] J2003h and iMarshalll d2006h a Bayesian 
approach was presented for such an analysis of weak lensing 
data from pointed observations towards known clusters. This used 
a highly effective, but computationally intensive, Markov Chain 
Monte Carlo (MCMC) sampler to explore the high-dimensional pa- 
rameter space, and employed the thermodynamic integration tech- 
nique to calculate the Bayesian evidence. In this paper, we ex- 
tend this work by utilizing the recently devel oped 'multimodal 
nested sampling' (MultiNest) technique jFeroz & Hobsonl2008l : 
Feroz et al. 2008), which is found to be ~ 200 times more efficient 
than traditional MCMC methods and thus enables one to search for 
multiple clusters in wide-field weak lensing data. MultiNest en- 
ables one to simultaneously detect clusters from the weak lensing 
data and perform the parameter estimation for the individual cluster 
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model parameters. Furthermore, following Hob son & McLachlanl 
d2003h . we also quantify our cluster detection through the applica- 
tion of Bayesian model selection using the Bayesian evidence value 
for each detected cluster, which can be easily calculated using the 
MULTlNEST technique. 

The outline of this paper is as follows. In Section [2] we de- 
scribe our methodology for detecting and characterising clusters in 
weak lensing survey data. In Section [3] we apply our method to a 
simple simulated data-set, before moving on to describe realistic 
cluster survey simulations and the results of our cluster extraction 
algorithm on these simulations in Section|4] Finally we present our 
conclusions in Section|5] 



2 METHODOLOGY 
2.1 Bayesian inference 

Our cluster detection methodology is built upon the principles of 
Bayesian inference, and so we begin by giving a brief summary of 
this framework. Bayesian inference methods provide a consistent 
approach to the estimation of a set of parameters © in a model (or 
hypothesis) H for the data D. Bayes' theorem states that 



Pr(0|D,#) = 



Pr(D| & y H)Pr(&\H) 



(1) 



Pr(D|#) 

where Pr(@|D,i/) = P(&) is the posterior probability distri- 
bution of the parameters, Pr(D|©, H) = £(©) is the likelihood, 
Pr(®\H) = tt(0) is the prior, and Pr(D|H) = Z is the Bayesian 
evidence. 

In parameter estimation, the normalising evidence factor is 
usually ignored, since it is independent of the parameters 0, and 
inferences are obtained by taking samples from the (unnormalised) 
posterior using standard MCMC sampling methods, where at equi- 
librium the chain contains a set of samples from the parameter 
space distributed according to the posterior. This posterior consti- 
tutes the complete Bayesian inference of the parameter values, and 
can be marginalised over each parameter to obtain individual pa- 
rameter constraints. 

In contrast to parameter estimation problems, for model se- 
lection the evidence takes the central role and is simply the factor 
required to normalize the posterior over ©: 



Z = j £(0)7r(0)d°0, 



(2) 



where D is the dimensionality of the parameter space. As the av- 
erage of the likelihood over the prior, the evidence is larger for 
a model if more of its parameter space is likely and smaller for a 
model with large areas in its parameter space having low likelihood 
values, even if the likelihood function is very highly peaked. Thus, 
the evidence automatically implements Occam's razor: a simpler 
theory with compact parameter space will have a larger evidence 
than a more complicated one, unless the latter is significantly bet- 
ter at explaining the data. The question of model selection between 
two models Ho and Hi can then be decided by comparing their 
respective posterior probabilities given the observed data set D, as 
follows 



R = 



Pr(#i|D) _ Pr(D|Hi)Pr(#i) _ Z x Pr(iT x ) 



Pr(# |D) Pr(D|#o)Pr(-ff ) Z Pt(H )' 



(3) 



where Pt(Hi)/ Pt(Ho) is the a priori probability ratio for the two 
models, which can often be set to unity but occasionally requires 
further consideration. 



Evaluation of the multidimensional integral in Eq. [2] is a 
challenging numerical task. Standard techniques like thermody- 
namic integration are extremely computationally expensive which 
makes evidence evaluation at least an order of magnitude more 
costly than parameter estimation. Some fast approximate meth- 
ods have been used for evidence evaluation, such as treating the 
posterior as a multivariate Gaussian centred at its peak (see e.g. 
Hobson & McLachlanl [2003h , but this approximation is clearly a 
poor one for multimodal posteriors (except perhaps if one performs 
a separate Gaussian approximation at each mode). The Savage- 
Dickey density ratio has also been proposed (see e.g. iTrottall 20071) 
as an exact, and potentially faster, means of evaluating evidences, 
but is restricted to the special case of nested hypotheses and a 
separable prior on the model parameters. Various alternative infor- 
mation crite ria for astrophysical model selection are discussed by 
lLiddleM2007» . but the evidence remains the prefer red method. 

The nested sampling approach, introduced by Skilling (2004), 
is a Monte Carlo method targeted at the efficient calculation of the 
evidence, but also prod uces posterior inferenc es as a by-product. 
iFeroz & Hobsonl d2008l) and lFeroz et"al] d2008h built on this nested 
sampling framework and have recently introduced the MULTlNEST 
algorithm which is very efficient in sampling from posteriors that 
may contain multiple modes and/or large (curving) degeneracies 
and also calculates the evidence. This technique has greatly re- 
duced the computational cost of Bayesian parameter estimation and 
model selection, and is employed in this paper. 



2.2 Weak lensing likelihood 

Our approach to detecting multiple clusters in weak lensing 
data follows the gene ri c ob ject detection strate gy advocated by 
Hobso n & McLachlanl d2003l) and refined by IFeroz & Hobsonl 
(2008). They show that the straightforward approach of using a 
single-object model for the data is both computationally far less 
demanding than adopting a multiple-object model and reliable, pro- 
vided that the objects of interest are spatially well-separated. It is 
important to understand that adopting a single-object model does 
not restrict one to detecting just a single cluster in the weak lensing 
data. Rather, by modelling the data in this way, one expects the pos- 
terior distribution to possess local maxima in the parameter space 
© of the single-cluster model, some of which will correspond to 
real clusters present in the data and some that occur because the 
pattern of ellipticities in the background galaxies 'conspire' to give 
the impression that a cluster might be present. The process of ob- 
ject detection and characterisation thus reduces to locating the local 
maxima of the posterior distribution in the parameter space © and 
deciding which of these local maxima correspond to a real cluster. 

A model cluster density profile can be determined from nu- 
merical iV-body simulations of large-scale structure formation in a 
ACDM unive rse. In particular, as suming spherical symmetry, the 
NFW profile dNavarro et al .11 19971) provides a good fit to the simu- 
lations and is given by 



p(r) = 



(r/r a )(l + r/r s ) 2 ' 



(4) 



where r s and p s are the radius and density at which the logarithmic 
slope breaks from —1 to —3. The mass M200 contained within the 
radius r^oo at which the density is 200 times the cosmological crit- 
ical density p cr jt at the redshift of the cluster can be calculated as 
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jEvrard et alJl2002l ; lAllen et al]2003h : 

M 200 



(4/3)7rr 2 3 00 



200p crit , 



47rp s r s 



log(l + c) 



1+c 



(5) 



where c = T2oo /r s is a measure of the halo concentration. Thus, 
we take as our cluster parameters © = (x c , y c , M200, c, z), where 
x c and y c are the spatial coordinates at which the cluster is centred, 
and 2 is its redshift. 



A cluster mass distribution is investigate d using weak gravita- 
tional lensing through the relationship (see e.g. lSchramrn & Kavsed 
E993): 



<e(x)) =g(a 



(6) 



that is, at any point x on the sky, the local average of the complex 
ellipticities e = ei + ie2 of a collection of background galaxy im- 
ages is an unbiased estimator of the local complex reduced shear 
field, g = g\ + ig2, due to the cluster. Adopting the thin-lens ap- 
proximation, for a projected mass distribution E(as) in the lens, the 
reduced shear g(x) is defined as 



g(x) = 



1 — k{x) ' 

where the convergence k(x) is given by 

E(aO 
«(») = ^ 



(7) 



(8) 



and the shear y(x) can, in general, be w ritten as a convolut ion in- 
tegral over the convergence k(x) (see e.g. lBridle et aljl 999). £ cr it 
is the critical surface mass density 



4tvG DiD h 



(9) 



where D s , D\ and D\ B are the angular-diameter distances between, 
respectively, the observer and each galaxy, the observer and the 
lens, and the lens and each galaxy. In general, the redshifts of each 
background galaxy can be different, but are assumed to be known. 
The lensing effect is said to be weak or strong if « <C 1 or k > 1 re- 
spectively. Analytic formulae for the convergence and shear fields 
produced by a clust e r with an NFW profi l e have b een calculated by 
iBartelmanrj dl996h : Iwright & Brainerd l l2000h : iMeneghetti et all 



1 2003) and we make use of these here to reduce computational 
costs. 

The observed complex ellipticity components of the iV ga i 
background galaxies can be ordered into a data vector d with com- 
ponents 

f Re(ei) (i < ATgai) 

d % = I . (10) 

[ Im(ei_jv gal ) (N saX + 1 < i < 2JVg a i) 

Likewise the corresponding components of the complex reduced 
shear g(xi) at each galaxy position, as predicted by the cluster 
model, can be arranged into the predicted data vector d p , with the 
arrangement of components matching Eq.UOl 

The uncertainty on the measured ellipticity components con- 
sists of two contributions. The intrinsic ellipticity components of 
the background galaxies (i.e. prior to lensing) may be taken as hav- 
ing been drawn independently from a Gaussian distribution with 
mean zero and variance af nt . Moreover, the effect of errors in the 
measured ellipticity components introduced by the galaxy shape 
estimation procedure can be modelled as Gaussian with mean zero 



and variance u^ hs . Assuming the intrinsic and observational contri- 
butions to the uncertai nty are independent, one can simply add th e 
two variances (see e.g. lHoekstra et al1l200d : [Marshall et alj|2003l) . 
This leads to a diagonal noise covariance matrix C on the ellipticity 
components, such that the elements corresponding to the ith galaxy 
arc 



2 

°"obs 



t [l - max(\g(xi)\ , l/\g(xi)\ )] 



(11) 



The term inside the square brackets is the correction for the galax- 
ies lying very close to the critical regions of the strong lensing 
cl uster as suggested b y ISchneider et al.l j2000h and implemented 
bv lBradac et aLlfe004h. 



As shown by Marshall et al. ( 200 J, we can then write the like 
lihood function as 

1 



£(©) 



cx p(-ix 2 ), 



where \ is the usual misfit statistic 

x 2 = (d-d p ^ 



JVgal 2 

= ££■ 

i=i j=i 

and the normalisation factor is 



■gj{xi)) 



Z L = (2tt) 



1/2 |C 1 1/2 



(12) 



(13) 



(14) 



(15) 



Note that is it necessary to include this normalisation factor in the 
likelihood, since the covariance matrix C is not constant, but de- 
pends on the cluster model parameters through the predicted shear 
terms in Eg. 1111 



2.3 Priors on cluster parameters 

To determine the model completely it only remains to specify the 
prior 7r(@) on the cluster parameters = (x c , y c , A/200, c , z). 
Throughout this paper we assume the prior to be partly separable, 
such that 



7r(0) = 7r(s c )7r(?/ c )7r(c)7r(M2oo, 



(16) 



We assume uniform priors on the position parameters x c and y c 
over ranges that slightly exceed the patch of sky for which lensing 
data is available, so that the model has the flexibility of having clus- 
ters that lie slightly outside the observed region. This is necessary 
since such clusters can still have measurable effects on the observed 
shear field. We also adopt the uniform prior tt(c) = U(0, 15) on the 
cluster concentration parameter. 

For the mass M200 (which we will denote henceforth by M for 
brevity) and redshift z parameters, we adopt a joint prior based on 
the Press-Schechter jPress & Schechten 19741) mass function within 
some ranges M min < M < M max and z min < z < z max . The 
chosen minimum and maximum values of these ranges used in 
our analysis of simulated weak-lensing data are discussed below. 
Numerical simulations have shown that the Press-Schechter mass 
function over-estimates the abund ance of high-mas s clusters and 
under-estimates those of low mass dSheth et al.l200ll) , but overall it 
still p rovides an adequate fit to iV-body simulations ( I Jenkins et al.l 
l200ll) . In particular, we assume the Press-Schecter mass function 
with erg = 0.8, which is plotted in FigureQ] along with some sam- 
ples drawn from it for illustration. In principle, one could allow erg 
to be an additional free parameter in our model that we attempt to 
constrain simultaneously with the parameters describing individual 
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redshift z 

Figure 1. The Press-Schechter mass function with as = 0.8, together with 
some samples drawn from it for illustration. The contours enclose 68%, 
95% and 99% of the probability. 



clusters. We have not pursued this possibility in this first applica- 
tion of our method, but it will be explored in a future work. It should 
also be remembered that our adoption of the Press-Schechter mass 
function as a prior still allows for the possibility of detecting clus- 
ters with masses and redshifts that lie outside the ranges favoured 
by this mass function, provided the data, through the likelihood 
function, are sufficiently conclusive. Finally, we note that is it triv- 
ial to replace the assumed Press-Schechter mass function in our 
analysis by any other mass function, if so desired. 



2.4 Quantifying cluster detection 

As mentioned above, one expects the posterior distribution P (0) 
in the parameter space of our single-cluster model to possess nu- 
merous local maxima, some corresponding to real clusters and 
some occurring because the pattern of ellipticities in the back- 
ground galaxies 'conspire' to give the impression that a cluster 
might be present. We now discuss how one may calculate the prob- 
ability that an identified local peak in the posterior corresponds to 
a real cluster (or set of clusters), thereby quantifying our cluster 
detection methodology. 

This quantification is most naturally performed via a Bayesian 
model selection by evaluating the evidence associated with each 
local posterior peak for com peting models for the data (see e.g. 
iHobson & McLachlanl [20031) . For each peak, it is convenient to 
consider the following hypotheses: 

Ho = 'no cluster with M > M\i m is centred in S\ 

Hi = 'at least one cluster with M > Mum is centred in 5", 

where S is taken to be a region just enclosing (to a good approxima- 
tion) the posterior peak in the spatial subspace x c = (x c , y c ), and 
Mii m is a lower limiting mass of interest that we discuss in more 
detail below. It is straightforward to see that Ho and Hi are mutu- 
ally exclusive and exhaustive hypotheses. The null hypotheses Ho 
is, however, the union of two other mutually exclusive hypothesis, 



i.e. Ho = H% U Hz, where 

H2 = 'at least one cluster with < M ^ Afiim is centred in S', 
Hz = 'no cluster is centred in S". 

Note that Hz is equivalent to considering clusters of zero mass. 
Moreover, if one choose Mu m = 0, then H2 becomes an empty 
hypothesis and Ho = Hz. 

For each posterior peak, we must calculate the model selection 
ratio R given in Eq.[3]between the hypotheses Ho and Hi. Using 
Bayes' theorem and the fact that Ho = H2 U Hz, with H2 and Hz 
being mutually exclusive, it is easy to show that R becomes 



R = 



Zi 



Pr(Hi 



Z 2 



1 + 



Pr(# 3 



Pr(H 2 ) 



+ Zz 



Pr(# 2 



Pr(H 3 



- 1 Pr(if ) ' 



(17) 



Note that for the special case Mvaa = 0, for which P(H2) = 
and Ho = Hz, the ratio R correctly reduces to the right-hand side 
ofEq.[3] 

For each hypothesis Hi (i = 1, 2, 3), the evidence is given by 



where 



7Ti(0) = ■K l (x c )n i (c)ir i (M, z), 



(18) 



(19) 



for i = 1, 2, 3 are priors that define the hypotheses. In particu- 
lar, the priors on the position of the cluster centre may be taken 
in all cases to be uniform: iTi(x c ) = 1/|5| for all i, where \S\ is 
the area of the region S. Similarly, for all hypotheses, the prior 
on cluster concentration may be taken 7Ti(e) = U(0, 15). Dif- 
ferences between the priors for the hypotheses do, however, oc- 
cur in the joint priors m(M, z) on the cluster mass and redshift. 
For hypothesis Hi, the prior is the appropriately normalised Press- 
Schechter mass function over the ranges Mij m < M ^ M max and 
Zmin < z ^ z m ax, and is zero otherwise^ For H2, the prior is the 
appropriately normalised Press-Schechter mass function over the 
ranges < M ^ Mn m and z m i n < z ^ z m ax, and is zero other- 
wise. Finally, for Hz, the prior is ttz{M, z) = 8(M)nz(z), where 
8(M) is the Dirac delta function centred on M — and ttz(z) can 
be any normalised distribution. 

Assuming the above priors, the evidence for each hypothesis 
can be written 



Z l (S) = |iy j P t (x c ) d 2 x c , 



(20) 



where we have defined the (unnormalised) two-dimensional 
marginal posterior 



Pi(x c ) = 



C(x c , c, M, z) TVi(c) iVi(M, z) dcdM dz. (21) 



The evidences (Eq.l20t for i — 1, 2 are easily obtained using the 
MultiNest algorithm, which automatically identifies local peaks 
in the post erior and evaluates the 'local' evidence as sociated with 
each peak dFeroz&HobsorJl2008l ; Feroz et al. 2008). One minor 
subtlety is that, when analysing the weak-lensing survey data, the 
uniform prior on the spatial position x c for clusters extends over 



1 Recall that, in any case, the Press-Schechter prior is assumed to be zero 
outside the ranges Af min < M < M max and z min < z < ^-max, as men- 



tioned in Section |23| Thus, even for non-zero Mji m , if My m 
then H2 becomes an empty hypothesis and Ho = Hz ■ 
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a region SI that slightly exceeds the full patch of sky for which 
lensing data is available, i.e. tt(x c ) = as discussed in Sec- 

tion ^. 3t . Thus, the local evidence returned by MultiNest as as- 
sociated with some posterior peak must me multiplied by |f2|/|S| 
to obtain the corresponding evidence (Eq.l20t. Finally, the evidence 
Zs(S) can be calculated directly without the need for any sampling 



MS) 



i 

W\ 



Co d x c = -Co, 



(22) 



since Co = C(x c ,c,M = 0, z) is, in fact, independent of the 
cluster parameters c and z and the priors are normalised. Note that 
Z$ is, in fact, independent of S. 

So far we have not addressed the prior ratios Pr(-Hi)/ Px(Ho) 
and Pr(i?2)/ Pr(Hs) in Eq. [T7] For the sake of illustration and 
simplicity, let us assume that the clusters are randomly distributed 
in spatial position. This is not entirely correct due to clustering of 
the galaxy clusters on large scales. Nonetheless, the departure from 
a random distribution in small fields is not expected to be signif- 
icant. First, consider the ratio Pr(Hi)/ Pr(Ho). If Ms is the (in 
general non-integer) expected number of clusters with M > Mu m 
centred in a region of size \S\, then the probability of there being 
N such clusters in the region is Poisson distributed: 



Pr(JV|A*s) = 



(23) 



Thus, bearing in mind the above definitions of Ho and Hi, we have 
Pt(Hi) 



Pt(Ho) CXP ^ S ' - 1 ~ MS for MS < 1, 
where /is is given in terms of the mass function by 

" ,ax dn 



Ms = \S\ 



dMdz 



dM dz, 



(24) 



(25) 



where dn/dMdz is the distribution of the projected number den- 
sity of clusters with masses between M and dM and redshift be- 
tween 2 and dz per unit area. 

Let us now consider the ratio Pr(H2)/ Pr(Hs) Similarly, let 
us suppose that Ag is the (in general non-integer) expected number 
of clusters with < M < Mu m centered in a region of size \S\. 
Then by the same argument as above 



Pr(H 2 



= exp(As) — 1 w As for A s < 1, 



Pr(H 3 ) 

where Ag is given in terms of the mass function by 

dn 



As = \S\ 



n 



dMdz 



■ dM dz. 



(26) 



(27) 



We are thus able to calculate the model selection ratio R in 
Eq. [T7] for each local posterior peak, which gives us the relative 
probability for obtaining a 'true' cluster detection (-ffi) as opposed 
to a 'false' one (Ho)- In particular, if the fcth local posterior peak 
has a ratio Rk, then the probability that this is a 'true' cluster de- 
tection is 



Pk = 



(28) 



1 + R k ' 

If we define a 'threshold probability' p t h, such that detections with 
Pk ■? Pth are identified as candidate clusters, the expected number 
of false positives, (npp) can then be calculated as 



(n FP ) 



E a 



Pk) 



(29) 



fc=l,Pfe>Pth 



where K is total number of detected posterior peaks. The expected 
'purity' of the resulting cluster sample, defined as the fraction of 
the cluster candidates that are 'true', can be similarly calculated. 

The choice of threshold probability p t h depends on the appli- 
cation. A lower value of p t h will obviously result in a lower purity 
but higher completeness, especially at the low-mass end where the 
lensing signal is particularly weak. In the presence of more infor- 
mation (e.g. the multi-band photometry) a lower purity but higher 
completeness might be preferable and hence, p t h should be set to 
a lower value in such cases. In the absence of any additional in- 
formation for the chosen survey field, p t h can be chosen so that 
the expected purity is relatively higher. We set p t h = 0.5 in this 
work, where we assume no additional information on the weak 
lensing simulations under analysis. This choice of p t h ensures that 
all the detections with a higher probability of being 'true' than be- 
ing 'false' are identified as candidate clusters. We discuss the im- 
pact of pth on the completeness and purity of the shear selected 
cluster sample derived from simulated weak-lensing data in Sec- 
tion|3] 



2.5 Estimation of individual cluster parameters 

Once a local posterior peak has been detected and identified as a 
cluster candidate using the above method, the values of the pa- 
rameters = (x c ,c, M, z) associated with that cluster are eas- 
ily calculated using the MultiNest algorithm. The algorithm 
identifies the samples associated with each posterior peak and al- 
lows one to draw posterior inferences u sing just these samples 
dFeroz & Hobsonll2008l ; [Feroz et alj|2008l) . Thus, for example, one 
can construct full one-dimensional marginalised posterior distribu- 
tions for each cluster parameter, from which best-fit values and un- 
certainties are trivially obtained. 



3 APPLICATION TO TOY WEAK-LENSING 
SIMULATIONS 

3.1 Simple weak-lensing survey simulation 

Before applying our methodology to weak-lensing data derived 
from full iV-body simulations in Section [4] we first demonstrate 
its performance in the ideal case, where the clusters have been sim- 
ulated using exactly the same model as the one we assume for the 
analysis. This is helpful in validating our approach. 

We simulate ten spherically-symmetric clusters, each with 
an NFW density profile, distributed uniformly in a 2000 x 2000 
arcsec 2 field, with cluster masses and redshifts drawn from the 
Press-Schechter mass function with M > 5 x 10 13 h~ 1 Mq and 
^ z ^ 1. We assume a ACDM cosmology with fi m = 0.3, 
SI a = 0.7, o~g — 0.8 and Hubble parameter h = 0.7. The con- 
centration parameter c for each cluster was drawn from a uniform 
distribution U(0, 15). The true cluster parameters are listed in Ta- 
ble[T]and the resulting convergence map is shown in Figure [2] (left 
panel). 

We consider the data only in the middle 1800 x 1800 arcsec 2 
region of the simulation in order to allow the cluster centers to lie 
outside the region for which data is available. We down-sample the 
resulting convergence and shear maps on a 256 x 256 pixel grid 
and add Gaussian noise with standard deviation a n = a,J \Jn 3 A, 
where n g is average number of galaxies per arcmin 2 and A is the 
pixel area. We assume a t ~ 0.3 and n g ~ 100 gal/arcmin 2 as 
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Figure 2. Left: true convergence map of the simulated clusters listed in Table[T] Middle: reconstructed convergence map obtained using the LenEnt2 algorithm. 
Right: inferred convergence map obtained using the MULTlNEST algorithm. 





x/arcsec 


y/arcsec 
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1 


338.9 


81.5 


7.3 x 10 13 


6.2 


0.57 


2 


-691.2 


-951.4 


9.3 x 10 13 


6.9 


0.21 


3 


-539.1 


785.3 


5.9 x 10 13 


4.0 


0.50 


4 


-452.6 


963.0 


5.0 x 10 13 


9.5 


0.35 


5 


-345.6 


-245.3 


5.1 x 10 13 


2.1 


0.92 


6 


-757.1 


-998.7 


5.4 x 10 13 


2.9 


0.47 


7 


877.8 


399.9 


1.9 x 10 14 


9.3 


0.47 


8 


-266.5 


483.2 


3.2 x 10 14 


9.4 


0.37 


9 


223.6 


308.4 


5.3 x 10 13 


11.1 


0.57 


10 


437.9 


-222.5 


8.6 x 10 13 


14.9 


0.53 



Table 1. True cluster parameters for the simple simulation discussed in Sec- 
tion[3~T1 



is approximately found for space-based weak lensing surveys. We 
assume all the background galaxies to lie at redshift z — 1. 



3.2 Analysis and results 

We applied our cluster finding methodology, using the MULTI- 
NEST algorithm, to this simple weak lensing survey simulation. 
As mentioned in Section |231 we impose uniform priors on clus- 
ter positions with the range of the priors set to be slightly wider 
than the range for which the data is available. We also (correctly) 
impose a uniform prior U (0, 15) on the concentration parameter. 
For the mass and redshift, we apply a joint prior coming from the 
Press-Schechter mass function with ^ z ^ 1 and 5 x 10 13 ^ 
M/h~ 1 MQ sC 5 x 10 15 . 

The MULTlNEST algorithm identified 34 posterior peaks. 
For each peak, the probability that it corresponds to a real clus- 
ter was calculated as discussed in Section 12.41 Mum was set to 
5 x 10 15 /i -1 Mq. Adopting a threshold probability p tb = 0.5, 
10 candidate clusters were identified. The corresponding inferred 
convergence map, made from the mean of 100 convergence maps 
with cluster parameter values drawn from their respective posterior 
distributions for the candidate clusters, is shown in the right-hand 



panel of Figure[2] For comparison, in the centre panel we also show 
the convergence map reconstructed using the LensEnt2 algorithm 
dMarshall et~aT]|2003l) . 

From Figure[2]one sees that 7 of the 10 candidate clusters cor- 
respond to real clusters. We must however, note that the two over- 
lapping clusters 2 and 6 in Table [TJ have been confused to yield 
a single detected cluster. Of the 10 candidate clusters, 3 were ac- 
tually false positives, but we note that R ~ 1 for these clusters, 
corresponding to equal probability of these clusters being either 
true or spurious. The 2 real clusters that were not detected (clusters 
4 and 5 in Table [T) are the least massive clusters in the simulation 
and consequently contribute a very small lensing signal. In fact, 
cluster 5 was detected by MULTlNEST but had probability ratio 
R < 1, corresponding to the probability of it being 'true' of less 
than 0.5; consequently it was not identified as a candidate cluster 
withp th = 0.5. 

The inferred parameter values, with 68 per cent confidence 
limits, for each of the 32 detected posterior peaks are shown in Fig- 
ure [3] (top panels). Also shown (bottom panels) is the logi? value 
and mass M of each posterior peak. Figure [4] shows the values of 
the inferred parameters (in red) for peaks with R > 1, and hence 
identified as cluster candidates (withp t h = 0.5), together with the 
true parameter values (in blue) for the corresponding cluster. 

In Figure [5] we also plot the expected and actual number of 
false positives obtained as a function of the 'threshold probability' 
Pth- The close agreement between the two curves indicates that our 
quantification procedure for cluster identification is extremely ro- 
bust. The corresponding purity and completeness as a function of 
Pth are plotted in Figure [6] 

Finally, we calcu late the Recei ver Operating Characteristic 
(ROC) curve (see e.g. lFawcett|[2006T) for our analysis procedure. 
The ROC curve provides a very reliable way of selecting the op- 
timal algorithm in signal detection theory. We employ ROC curve 
here to analyse our cluster candidate identification criterion, based 
on the threshold probability p t ^. The ROC curve plots the True Pos- 
itive Rate (TPR) against the False Positive Rate (FPR) as a function 
of the threshold probability. TPR is the ratio of the number of true 
positives for a given p t h to the number of true positives in all the 
detected clusters (i.e. for p t h = 0) which for the present analysis 
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Figure 3. Top panels: inferred parameters values and 68 per cent confidence intervals for each of the posterior peaks detected by the MULTlNEST algorithm 
when applied to the simple simulation discussed in Section |3T| Bottom panels: the logi? value and mass M of detected posterior peak. In both cases, peaks 
with R > 1 are plotted in red and the remaining peaks in blue. 
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Figure 4. Inferred parameter values with 68 per cent confidence limits (red) compared to true parameter values (blue) for each candidate cluster. 



is 8 corresponding to 8 clusters that had a strong enough lensing 
signal to be identified by our algorithm. Similarly, FPR is the ratio 
of the number of false positives for a given p t h to the number of 
false positives for p t h = 0. The best possible method would yield 
a point in the upper left comer of the ROC space, with coordinates 
(0,1). A completely random guess would, on average, yield a point 
on the diagonal line. The ROC curve traced out as a function of p t h 
for our cluster detection methodology is plotted in Figure [7] It can 
bee seen that although, p t h = 0.6 gives the optimal cluster candi- 



date identification criteria, pth = 0.5 is very nearly as good. The 
accuracy of an algorithm can be measured by calculating the area 
under the ROC curve. An area of 1 represents a perfect algorithm 
while an area of 0.5 represents a worthless algorithm. We notice 
that the area under the ROC curve in Figure|7]is very close to 1, 
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Figure 5. The expected and actual number of false positives as function of 
the 'threshold probability' p t ^ (see Section l2~4t . 
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Figure 7. Receiver Operating Characteristic (ROC) curve obtained by vary- 
ing the threshold probability p t ^ used in identifying candidate clusters for 
the analysis of the simple simulated weak lensing data-set discussed in Sec- 
tion !3.2l The points are labelled with their corresponding p t ^ value. 
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Figure 6. Purity and completeness as a function of the 'threshold probabil- 
ity' for the analysis of the simple simulated weak lensing data described in 
Section[X2l 



4 APPLICATION TO REALISTIC WEAK-LENSING 
SIMULATIONS 

We now describe the results of our cluster finding algorithm when 
applied to simulated weak-lensing survey data derived from numer- 
ical TV-body simulations. 



4.1 Realistic weak-lensing survey simulation 

In this case, the weak-lensing data is simulated using by ray-tracing 
through a num erical TV-body simulation of structure formation 
dWhitel d2005k covering a 3 x 3 degree field of view. The cos- 
mological model is taken to be a concordance ACDM model with 
parameters fi m = 0.3, J1a = 0.8, as = 0.9 and the Hubble 
parameter h = 0.7. The simulations employed 384 3 equal mass 
(10 10 A/q/i -1 ) dark matter particles in a periodic cubical box of 
side 2Q0h~ 1 Mpc that was evolved to z = 0. The redshift distribu- 



p{z) 



(30) 



with zq — 1.0 and b = 1.5. 

There are roughly 200 galaxies per arcmin 2 in the simulation, 
which represents the deepest space-based surveys. In practice, the 
signal-to-noise magnitude threshold gives around 100 galaxies per 
square arcminute and then insisting on knowing the photometric 
redshift of each galaxy pushes the number down to 40 — 70 per 
arcmin 2 . Since there is no colour-magnitude information for the 
galaxies available in these simulations, we pick 65 galaxies per 
square arcminute randomly from the galaxy catalogue. The intrin- 
sic ellipticity distribution of the galaxies is assumed to be Gaussian 
with (Tint = 0.25. We add Gaussian observational noise with stan- 
dard deviation c b s = 0.20 to the sheared e llipticity of each galaxy. 

Following Hennawi & Spergell J2005h . we define all the ha- 
los with A/200 > 10 13,5 /i _1 Mq as candidates to be identified 
by weak-lensing, since the finite number of background galax- 
ies and their intrinsic ellipticities places a lower limit on the 
mass of the halo that can be detected. The halo catalogue for 
the si mulation, produced us ing the Friends-of-Friends algorithm 
(FoF) tefstathiou et~ai]|l985h , contains 1368 halos with 10 13 5 < 
M 2 oo/h- 1 M Q < 7.7 x 10 14 and < z < 2.7. 



4.2 Analysis and results 

Modelling the entire 3x3 degree 2 field of view represents a 
highly computationally expensive task and the MultiNest algo- 
rithm would require a prohibitively large number of live points to 
be able detect such a great many clusters. We therefore divide the 
data into 16 patches of 0.75 x 0.75 degree 2 each and use 4000 
live points to analyse each patch. As before, we expand the ranges 
of uniform priors on the position of cluster centers to allow them to 
lie a little outside their respective patches, and we use uniform prior 
U(0, 15) for the concentration parameter. In order to determine the 
effect Press-Schechter mass function prior has on the detectability 
of clusters, as well as on the inferred cluster parameters, we per- 
form the analysis using the joint Press-Schechter prior on M200 
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Figure 8. The total number of clusters in a given redshift and mass bin 
in the true catalogue (red) against the number of detected clusters (green) 
for the analysis with weak lensing simulation discussed in Section l4~Tl with 
p t h = 0.5. The Press-Schechter mass function was used as ajoint prior on 
A/200 and z. 
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Figure 9. Completeness (red) and purity (green) for the analysis with weak 
lensing simulation discussed in Section |4~71 with p t ^ = 0.5. The Press- 
Schechter mass function was used as ajoint prior on M200 and 2. 



and z, as well as a logarithmic prior on M200 and uniform prior on 
z. In both cases, we assume the redshift range < z ^ 4 and the 
mass range 10 13 ' 5 < M 2 oo/{h' 1 M Q ) ^5x 10 15 . 

Before discussing the results from applying MultiNest to 
the weak-lensing simulation, we note that the 'true' cluster cata- 
logue for iV-body simulation are, i n fact, inferred using a halo find- 
ing algorithm. The FoF algorithm Efstathi ou et al.l l ll985t) is most 
often employed for this purpose and works by associating all the 
particles within a distance of one another that is some factor b of 
mean distance between the particles. The number of halos identi- 
fied thus depends very strongly on the linking parameter b. Depend- 
ing on the resolution of the iV-body simulation and the value of b, 
FoF may or may not resolve out structures present within clusters. 
MultiNest technique is very sensitive to any structure present 
within clusters and will classify these structures as individual clus- 
ters. Therefore, there is always a possibility of MultiNest mak- 
ing an identification of two clusters very close to each other while 
the FoF algorithm would identified these as one cluster; this would 
result in a lower purity for the MultiNest catalogue. 



1 — 1 06 




Figure 10. Completeness in mass-redshift plane for the analysis of the 
weak-lensing simulation discussed in Section |4~T1 with p t h = 0.5. The 
Press-Schechter mass function was used as ajoint prior on A/2oo and z. 



4.2.1 Completeness and purity 

MULTINEST found around 600 halos out of which 293 and 268 ha- 
los were identified as candidate clusters using pth = 0.5 with the 
Press-Schechter and log-uniform priors respectively. Catalogues 
matching was performed for each of these cluster candidates by 
finding the closest cluster in the true cluster catalogue for which 
all the cluster parameters lie within 4-cr of the inferred mean val- 
ues. Any cluster candidate not having such a corresponding cluster 
in the true catalogue was identified as a false positive. Using this 
matching scheme, 187 and 175 cluster candidates were identified 
as true positives, giving a 'purity' 64% and 65%, for the Press- 
Schechter and log-uniform priors respectively. For the analysis with 
the Press-Schechter prior, we plot the number of clusters in the true 
catalogue and detected clusters as function of true cluster M200 
and z in Figure[8] We plot purity and completeness as a function of 
cluster A/200 and z in Figure|9] In FigurefTOlwe plot the complete- 
ness in the mass-redshift plane, for A/200 < 3 x 10 14 /i -1 A/q and 
z < 1.5, since there are very few clusters in the true catalogue out- 
side these ranges and consequently we suffer from small number 
statistics. 

From Figures [9] and [TUJ it is clear that the completeness of 
our shear-selected cluster sample approaches unity only for mas- 
sive cl usters with A/200 ~ 5 x lO 14 fe~ 1 A/0. lHennawi & Spergell 
d2005h reached a similar conclusion using the Tomographic 
Matched Filtering (TMF) scheme. U nfortunately, a direct compar- 
ison with lHennawi & S pergel ( 20051) is not possible, as they used a 



different iV-body simula tion. Some o t her pr e vious studies of shear- 
selected cluster samples Iwhite et alj f2002l) ; lHamana et al.l ( 2004) 
have also come to the same conclusion that they suffer from severe 
incompleteness except at the high-mass end. 

In Figure[TT]we plot completeness and purity as a function of 
threshold probability p t h for the analysis with the Press-Schechter 
prior. We notice that even forp t h ~ 1, purity is around 0.7, while 
one would expect it to be very close to unity. This discrepancy oc- 
curs because of the presence of sub-structure in high-mass clusters. 
As discussed in the previous section, the MultiNest algorithm is 
very sensitive to any sub-structure within clusters, and identifies as 
separate clusters the halos that the FoF algorithm may or may not 
identify as belonging to a single cluster, depending on the impact 
parameter b. 



4.2.2 Accuracy of parameter estimates 

We now discuss the accuracy of recovered parameters of the de- 
tected clusters. The 'true' cluster catalogues did not have the con- 
centration parameter and therefore we do not discuss the accuracy 
of inferred concentration of each cluster. 

For p t h = 0.5, in Figures[l2][T3]and[l4] respectively, we plot 
for each detected cluster: the inferred cluster parameters against 
the true cluster parameters; the difference between the inferred pa- 
rameters and the true parameters; and the difference between the 
inferred parameters and the true parameters against the true param- 
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Figure 12. Inferred parameter values (on the y-axis) against the 'true' parameter values (on the x-axis) for each of the detected clusters. The analysis was 
performed with p t ^ = 0.5 using (a) the Press-Schechter mass function prior on M200 and z and (b) a logarithmic prior on A/200 ana " uniform prior on z. 
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Figure 13. Difference between the inferred parameter values and the true parameter values from the cluster catalogue, plotted for each detected cluster. The 
analysis was performed with p t ^ = 0.5 using (a) the Press-Schechter mass function prior on M200 and z and (b) a logarithmic prior on M200 and uniform 
prior on z. 
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Figure 14. Difference between the inferred parameter values and the true parameter values from the cluster catalogue, plotted against the true parameter values 
for each of the detected clusters. The analysis was performed with p t h = 0.5 using (a) the Press-Schechter mass function prior on M200 and z and (b) a 
logarithmic prior on A^oo and uniform prior on z. 
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ability' for the analysis of the simulated weak-lensing data described in 
Section l4~2l The Press-Schechter mass function was used as a joint prior on 
M 2 oo and z. 



eters. Each figure contains the results for the Press-Schechter and 
log-uniform priors. 

It can be seen from these figures that the cluster positions have 
been reasonably well estimated, but the inferred cluster masses 
have generally been positively biased for both Press-Schechter and 
logarithmic priors on M2oo- For logarithmic priors on M200 this 
bias is particularly strong. Applying the Press-Schechter prior does 
result in better parameter estimates for cluster masses but neverthe- 
less, the inferred masses of the low-mass clusters have still been 
over-estimated. This result agrees well with what is already known 
from the iV-body simulations, the Press-Schechter mass function 
over-estimates the abundance of high-mass clusters and under- 
estimates the abundance of low-mass clusters and consequently, the 
inferred masses of the low mass clusters have been over-estimated. 
This highlights the importance of having physically motivated pri- 
ors for parameterising clusters in the weak lensing data-sets. This 
biasing of cluster masses, particularly at the low-mass end points 
to the fact that although the Press-Schechter mass function gives 
better estimates and reduces the bias in cluster parameters as com- 
pared with a logarithmic prior on M200, it still does not fit this 
particular iV-body simulation extremely well. The inferred cluster 
redshifts have a large scatter for both the Press-Schechter and log- 
arithmic priors on M200. This is because z is correlated with M200 
and hence some additional information is required to break this de- 
generacy and get better estimates for both M200 and z. 

We further note that we have assumed a spherical NFW model 
for the cluster mass profiles, while several studie s have shown that 
the clusters are n ot necessarily spherical (see e .g. lShaw et al .120061 : 
iBett et al.l2007l) . |Corless & Kind J2007l . l2008h have recently shown 
that ignoring the triaxial 3D structure can result in inaccuracies in 
cluster parameter estimates. We plan to include the triaxial NFW 
mass profile in a future work to assess the importance of including 
3D triaxial structure on cluster parameter estimation. 



5 CONCLUSIONS 

We have introduced a very efficient and robust approach to detect- 
ing galaxy clusters in wide-field weak lensing data-sets. This ap- 



proach allows the parameterisation of each detected cluster. Fur- 
thermore, using Bayesian model selection, one can also calculate 
the probability odds for each detected cluster being 'true'. This 
quantification of cluster detection allows flexibility in determining 
the cluster candidate selection criterion depending on the applica- 
tion. Inspite of the non-linear nature of the analysis method, we 
are able to search at a rate of one-half square degree per hour on 
a single processor. The code is fully parallel, making this a viable 
technique even for the deepest weak-lensing surveys. 

An application of our algorithm to simulated weak-lensing 
data derived by an iV-body simulation showed that the shear- 
selected cluster sample suffers from severe incompleteness at the 
low-mass and high-redshift ends of the cluster distribution, with 
the completeness approaching unity only for massive clusters with 
M200 ~ 5 x 10 14 h~ 1 Mq. We also demonstrated the importance 
of the priors in estimating the masses and redshifts of low-mass 
clusters, since the lensing signal produced by them is particularly 
weak. We used the Press-Schechter mass function as a prior on 
cluster masses and redshift in this work and found it to be produce 
inferred masses of the low-mass clusters that are biassed low. 

In a future study, we intend to extend this work by including 
the triaxial NFW mass profile to investigate what fraction of clus- 
ters show significant triaxiality in the ACDM model iV-body sim- 
ulation, and also to assess the importance of triaxiality in cluster 
parameter estimation. 
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